Abstract

Polynomials of independent random variables arise in a variety of fields such as Machine
Learning, Analysis of Boolean Functions, Additive Combinatorics, Random Graph Theory,
Stochastic Partial Differential Equations etc. They naturally model the expected value of the
objective function (or the left-hand side of constraints) for randomized rounding algorithms for
non-linear optimization problems, where one finds a solution of an "easy" continuous problem and
rounds it to a solution of a "hard" integral problem (one such example is Convex Integer
Programming [?]). To measure the performance guarantee of such algorithms one needs, analogously
to the analysis employed by Raghavan and Thompson for boolean integer programming problems,
an analog of Chernoff Bounds for polynomials of independent random variables. There are many
known forms and variations of Chernoff Bounds. One of the tightest is based on the variance
of a sum of random variables and is known as the Bernstein inequality. Another popular, albeit
weaker, version uses an estimate of the variance through the expectation. Concentration
inequalities of the latter kind for polynomials of independent random variables are known [?].
In this paper we derive an analog of the Bernstein Inequality for multilinear polynomials of
independent random variables.

We show that the probability that a multilinear polynomial $f$ of independent random variables
exceeds its mean by $\lambda$ is at most $e^{-\lambda^2/(R^q \mathrm{Var}(f))}$ for sufficiently
small $\lambda$, where $R$ is an absolute constant. This matches (up to constants in the exponent)
what one would expect from the central limit theorem. Our methods handle a variety of types of
random variables including Gaussian, Boolean, exponential, and Poisson. Previous work by Kim-Vu
and Schudy-Sviridenko gave bounds of the same form that involved less natural parameters in place
of the variance.

1 Introduction 

Polynomials of independent random variables arise in a variety of fields such as Machine
Learning, Analysis of Boolean Functions, Additive Combinatorics, Random Graph Theory, Stochastic
Partial Differential Equations etc. They naturally model the expected value of the objective
function (or the left-hand side of constraints) for randomized rounding algorithms for non-linear
optimization problems, where one finds a solution of an "easy" continuous problem and rounds it to
a solution of a "hard" integral problem (one such example is Convex Integer Programming [?]). To
measure the performance guarantee of such algorithms one needs, analogously to the analysis
employed by Raghavan and Thompson for boolean integer programming problems, an analog of Chernoff
Bounds for polynomials of independent random variables. There are many known forms and
variations of Chernoff Bounds. One of the tightest is based on the variance of a sum of random
variables and is known as the Bernstein Inequality [3], [19]. Another popular, albeit weaker,
version uses an estimate of the variance through the expectation. Concentration inequalities of
the latter kind for polynomials of independent random variables are known [18]. In this paper we
derive an analog of the Bernstein Inequality for multilinear polynomials of independent random
variables.

Perhaps the most celebrated theorem in statistics is the central limit theorem. This theorem
(actually a family of theorems) states conditions under which a sum of $n$ independent random
variables converges to being normally (i.e. Gaussian) distributed as $n \to \infty$. Let
$Y_1, Y_2, \ldots$ be a sequence of independent random variables. Let
$\mathrm{Var}[Z] = E[(Z - E[Z])^2] = E[Z^2] - E[Z]^2$ be the variance of the random variable $Z$.
Various central limit theorems state various conditions on the $Y_i$ under
which
for any $\lambda \in \mathbb{R}$. One sufficient condition is that the $Y_i$ are identically
distributed with finite variance [?]. Another set of sufficient conditions is that there exist
$M, \epsilon > 0$ such that all $E[|Y_i|^{2+\epsilon}] < M$ and
$\lim_{n \to \infty} \mathrm{Var}\left[\sum_{i=1}^n Y_i\right] = \infty$ [?].

The rate of convergence of the limit is often of interest. The Berry-Esseen theorem states that
when the $Y_i$ are identically distributed with finite $E[|Y_i|^3]$ and $E[Y_i] = 0$ then
Many applications require upper bounds on the probability of large deviations for finite $n$. The
Berry-Esseen bound (1.2) is exponentially far from tight for many such applications; for example,
the probability that at least three-quarters of a sequence of $n$ coin flips are heads is
$2^{-\Theta(n)}$ but the Berry-Esseen bound is $O(1/\sqrt{n})$. Fortunately it is possible to do
much better in many cases. For example if the $Y_i$ are independent random variables with
$0 \le Y_i \le 1$ then a standard Bernstein inequality (e.g. Theorem 2.3 (b) in [?]) states that
for any $\lambda > 0$ where $\mu = E\left[\sum_{i=1}^n Y_i\right]$. Note that the small-$\lambda$
probability bound is roughly $\exp\left(-\lambda^2/(2\mu)\right)$,
which matches the Gaussian behavior suggested by the central limit theorem except for the use of
the upper bound $\mu$ for the variance in place of the variance. This discrepancy can be remedied,
yielding another variant of the Bernstein inequality (see Theorem 2.7 in [15])
where $V = \mathrm{Var}\left[\sum_{i=1}^n Y_i\right]$. For $\lambda < V$ this matches (up to
constants in the exponent) what the central limit theorem suggests.
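
As a rough numerical illustration of the gap described above, the following Python sketch computes the exact probability that at least three-quarters of n fair coin flips are heads and prints it next to the 1/sqrt(n) scale of a Berry-Esseen error term. The function name `heads_tail` and the sampled values of n are illustrative assumptions and do not come from the paper.

```python
from math import comb, log2, sqrt


def heads_tail(n: int) -> float:
    """Exact Pr[at least 3n/4 of n fair coin flips are heads]."""
    threshold = (3 * n + 3) // 4  # integer ceiling of 3n/4
    return sum(comb(n, j) for j in range(threshold, n + 1)) / 2 ** n


if __name__ == "__main__":
    for n in (20, 40, 80, 160):
        p = heads_tail(n)
        # -log2(p)/n stays bounded away from 0, so p decays exponentially in n,
        # while 1/sqrt(n) shrinks only polynomially.
        print(n, p, -log2(p) / n, 1 / sqrt(n))
```

The exact tail is computed with integer binomial coefficients, so no sampling error enters the comparison.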



Kim and Vu introduced variants of the Chernoff bound (1.3) for polynomials of independent Boolean
random variables [?]. Vu [20] tightened and generalized the bounds to handle independent random
variables with arbitrary distributions in the interval [0,1]. Schudy and Sviridenko [18] proved a
stronger concentration inequality for polynomials of independent random variables satisfying a
general condition (see Definition 1.1). Note that [20] contains one extension not handled in [18]
and this paper, namely using fewer than $q$ (the degree of the polynomial) smoothness parameters.
These bounds share the Gaussian-like behavior for small $\lambda$ with (1.3), but they use an
upper bound on the variance that is more complicated than the $\mu$ used in (1.3). The behavior
for large $\lambda$ is also different. Our main contribution is an analog of (1.4) for a
polynomial $f(Y)$ of power $q$:

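$$\Pr\left[\,|f(Y) - E[f(Y)]| \ge \lambda\,\right] \le e^2 \exp\left(-\frac{\lambda^2}{R^q\, \mathrm{Var}[f(Y)]}\right) \qquad (1.5)$$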



for all sufficiently small $\lambda$ (see Theorem 1.3 for the precise statement), where $R$ is an
absolute constant. What values of $\lambda$ are "sufficiently small" depends on parameters
$\mu_1, \mu_2, \ldots, \mu_q$ defined in the next section. For example in the setting of (1.4) we
reproduce that bound up to constants in the exponent: Gaussian-like tails for $\lambda < V$ and
exponential tails for larger $\lambda$. Some polynomials require $\lambda$ to be so small that
$e^2 e^{-\lambda^2/(R^q \mathrm{Var}[f(Y)])}$ always exceeds 1 and hence (1.5) is vacuous. We
expect that most applications will involve $\lambda$ sufficiently small for (1.5) to apply.

The improvement of (1.5) compared to the concentration inequalities in Schudy-Sviridenko [18]
is analogous to the improvement of (1.4) compared to (1.3). There are countless applications of
the Bernstein Inequality (or its variants known as Chernoff or Hoeffding Bounds) and its
martingale versions [?]. Recent algorithmic applications of the martingale version of the
Bernstein Inequality that require dependence on variance instead of expectation are [?] and [14].
Analogously, we expect that in the future there will be many applications (e.g. counting in
random graphs) where one would necessarily need the stronger inequality of Theorem 1.3 (analog of
(1.4) for polynomials) instead of Theorem 1.4 (analog of (1.3) for polynomials). Note that before
our work such statements were not even known for boolean random variables.

1.1 Our Results

We are given a hypergraph $H = (V(H), \mathcal{H}(H))$ consisting of a set
$V(H) = \{1, 2, \ldots, n\} = [n]$ of vertices and a set $\mathcal{H}(H)$ of hyperedges. A
hyperedge $h$ consists of a set $V(h) \subseteq V(H)$ of $|V(h)| \le q$ vertices. We are also
given a real-valued weight $w_h$ for each $h \in \mathcal{H}(H)$. For each such weighted
hypergraph we define a multilinear polynomial

$$f(x) = \sum_{h \in \mathcal{H}(H)} w_h \prod_{v \in V(h)} x_v.$$

We call the maximum hyperedge cardinality $q$ the power of the polynomial $f$.
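
The correspondence between weighted hypergraphs and multilinear polynomials can be sketched in a few lines of Python. The representation below (a dict from frozensets of vertices to weights) and the names `eval_multilinear`, `power`, `example_weights` and `example_point` are illustrative assumptions, not notation from the paper.

```python
from math import prod


def eval_multilinear(weights, x):
    """Evaluate f(x) = sum_h w_h * prod_{v in V(h)} x_v.

    weights: dict mapping a hyperedge (frozenset of vertices) to its weight w_h.
    x: dict mapping each vertex to a real value.
    """
    return sum(w * prod(x[v] for v in h) for h, w in weights.items())


def power(weights):
    """Maximum hyperedge cardinality, i.e. the power q of the polynomial."""
    return max(len(h) for h in weights)


# Hypothetical toy instance: f(x) = 2*x1*x2 - x3.
example_weights = {frozenset({1, 2}): 2.0, frozenset({3}): -1.0}
example_point = {1: 3.0, 2: 5.0, 3: 7.0}
```

For the toy instance, `eval_multilinear(example_weights, example_point)` evaluates 2*3*5 - 7 and `power(example_weights)` is the size of the largest hyperedge.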

We use essentially the same smoothness parameters as Kim and Vu [?, ?] and our previous work
[18]. For a given collection of independent random variables $Y = (Y_1, \ldots, Y_n)$, hypergraph
$H$, weights $w$ and integer parameter $r \ge 0$, we define

$$\mu_r = \mu_r(Y, H, w) = \max_{S \subseteq [n] : |S| = r} \left( \sum_{h \in \mathcal{H} : V(h) \supseteq S} |w_h| \prod_{v \in V(h) \setminus S} E[|Y_v|] \right).$$

Note that $S$ need not be a vertex set of some hyperedge of $H$ and may even be the empty set.
Sometimes we will also use the notation $\mu_r(f) = \mu_r(Y, H, w)$ to emphasize the dependence on
the polynomial $f$.
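
To make the smoothness parameters concrete, here is a brute-force Python sketch. It takes $\mu_r$ to be the maximum, over vertex sets $S$ of size $r$, of the sum over hyperedges $h$ containing $S$ of $|w_h|$ times the product of $E[|Y_v|]$ over $v \in V(h) \setminus S$. The function name `mu`, the data layout and the toy instance are assumptions made for illustration, and the enumeration is exponential in $r$, so it is only meant for tiny examples.

```python
from itertools import combinations
from math import prod


def mu(r, n, weights, abs_mean):
    """Brute-force mu_r for a weighted hypergraph on vertices 1..n.

    weights: dict mapping a hyperedge (frozenset of vertices) to its weight w_h.
    abs_mean: dict mapping each vertex v to E[|Y_v|].
    """
    best = 0.0
    for subset in combinations(range(1, n + 1), r):
        s = frozenset(subset)
        # Sum over hyperedges that contain S; vertices in S contribute no factor.
        total = sum(
            abs(w) * prod(abs_mean[v] for v in h - s)
            for h, w in weights.items()
            if s <= h
        )
        best = max(best, total)
    return best


# Hypothetical toy instance: f(x) = 2*x1*x2 - x2*x3 with E[|Y_v|] = 1/2 for every v.
toy_weights = {frozenset({1, 2}): 2.0, frozenset({2, 3}): -1.0}
toy_abs_mean = {1: 0.5, 2: 0.5, 3: 0.5}
```

With $r = 0$ the only candidate set is the empty set, so `mu(0, ...)` is the sum of $|w_h| \prod_{v} E[|Y_v|]$ over all hyperedges.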



In the previous work [18], we proved moment and concentration inequalities that could be viewed
as an extension of (1.3) to polynomials of random variables satisfying the following condition.



Definition 1.1 A random variable $Z$ is called moment bounded with real parameter $L > 0$,
if for any integer $i \ge 1$ we have

$$E[|Z|^i] \le i \cdot L \cdot E[|Z|^{i-1}].$$

That work [18] showed that three large classes of random variables are moment bounded: bounded,
continuous log-concave [?] and discrete log-concave. The results of the current paper apply
to a related type of random variable.

Definition 1.2 A random variable $Z$ is called central moment bounded with real parameter
$L > 0$, if for any integer $i \ge 1$ we have

$$E\left[|Z - E[Z]|^i\right] \le i \cdot L \cdot E\left[|Z - E[Z]|^{i-1}\right].$$

In Section 7 we show that the three classes of random variables that are known to be moment
bounded (i.e. bounded, continuous log-concave and discrete log-concave) are also central moment
bounded. For example Poisson, geometric, normal (i.e. Gaussian), and exponential distributions
are all central moment bounded.
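
The central moment boundedness condition can be probed numerically. The sketch below uses the closed form $E|Z|^i = 2^{i/2}\,\Gamma((i+1)/2)/\sqrt{\pi}$ for a standard normal $Z$ and reports the smallest $L$ that satisfies the inequality of Definition 1.2 for $i = 1, \ldots, \texttt{max\_i}$. It is a finite check under these assumptions, not a proof, and the function names are hypothetical.

```python
from math import gamma, pi, sqrt


def gaussian_abs_moment(i: int) -> float:
    """E|Z|^i for a standard normal Z (already centered)."""
    return 2 ** (i / 2) * gamma((i + 1) / 2) / sqrt(pi)


def rademacher_abs_moment(i: int) -> float:
    """E|Z|^i for Z = +1 or -1 with probability 1/2 each."""
    return 1.0


def smallest_L(abs_central_moment, max_i: int) -> float:
    """Smallest L with E|Z-EZ|^i <= i * L * E|Z-EZ|^(i-1) for i = 1..max_i."""
    return max(
        abs_central_moment(i) / (i * abs_central_moment(i - 1))
        for i in range(1, max_i + 1)
    )


if __name__ == "__main__":
    print(smallest_L(gaussian_abs_moment, 60))
    print(smallest_L(rademacher_abs_moment, 60))
```

In both examples the binding constraint over the tested range is $i = 1$, where the condition reduces to $E|Z - E[Z]| \le L$.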

We prove the following: 

Theorem 1.3 We are given $n$ independent central moment bounded random variables
$Y = (Y_1, \ldots, Y_n)$ with the same parameter $L$. We are given a multilinear polynomial $f(x)$
of power $q$. Let $f(Y) = f(Y_1, \ldots, Y_n)$ then
where $R$ is some absolute constant.

Quite often in applications $\mu_r$ for $r = 1, \ldots, q-1$ are negligibly small and
$\mu_q = 1$ (e.g. [?]). In this case, the right hand side of (1.7) becomes

$$e^2 \cdot \max\left\{ e^{-\frac{\lambda^2}{\mathrm{Var}[f(Y)] \cdot R^q}},\; e^{-\frac{\lambda^{1/q}}{L R}} \right\}$$

and our Theorem 1.3 implies that tails of multilinear polynomials of central moment bounded
random variables have Gaussian-like distribution for $\lambda \le \mathrm{Var}[f(Y)]^{q/(2q-1)}$
and constants $L, R$. Previous work [18] proved a similar theorem:



Theorem 1.4 ([18]) We are given $n$ independent moment bounded random variables
$Y = (Y_1, \ldots, Y_n)$ with the same parameter $L$. We are given a multilinear polynomial $f(x)$
with nonnegative coefficients of total power $q$. Let $f(Y) = f(Y_1, \ldots, Y_n)$ then
where $R$ is some absolute constant.
We show that the parameter $\mathrm{Var}[f(Y)]$ in Theorem 1.3 is always at least as good as the
$\max_{r \in [q]} (\mu_0 \mu_r L^r)$ in Theorem 1.4:
Lemma 1.5 For a multilinear polynomial f as in Theorem we have



Var[f{Y)] < 2q4i max (/io(/, Y)Mf, ■ 



Lemma 1.5 implies that our Theorem 1.3 dominates Theorem 1.4 from [18] in the common case when
the central moment boundedness parameter $L$ is of the same order as the moment boundedness
parameter.



Previous work [18] showed that Theorem 1.4 has a tight dependence on the parameters
$\mu_1, \ldots, \mu_q$ up to factors of logarithms and $q^{\Theta(q)}$ in the exponent. That
lower bound only applies to bounds that depend only on those parameters and hence Theorem 1.3,
which additionally depends on $\mathrm{Var}[f(Y)]$, does not contradict it.

1.2 Comparing with Hypercontractivity concentration inequality and other results

It is well known that considering sums of centered (i.e. $E[X] = 0$) and subgaussian (i.e.
$E[|X|^k] \le L^k k^{k/2} E[|X|]$) random variables improves the concentration bounds. Namely,
the concentration around the mean stays Gaussian even for large values of $\lambda$, unlike the
case of a sum of non-centered (even boolean) random variables, where the concentration bounds
start to behave like those of an exponential random variable. Therefore, we can expect a similar
phenomenon for polynomials of independent centered subgaussian random variables, i.e. the
concentration bounds for polynomials of independent centered subgaussians should give tighter
concentration around the mean for larger values of $\lambda$.

Two specific examples of such variables are centered Gaussian and Rademacher (+1 or -1 with
probability 1/2) random variables. There are two concentration inequalities known in the
literature specific to that setting.

Theorem 1.6 (Hypercontractivity Concentration Inequality) Consider a multilinear degree $q$
polynomial $f(Y) = f(Y_1, \ldots, Y_n)$ of independent Normal or Rademacher random variables
$Y_1, \ldots, Y_n$. Then

$$\Pr\left[\,|f(Y) - E[f(Y)]| \ge \lambda\,\right] \le e^2 \cdot e^{-\left(\frac{\lambda^2}{R \cdot \mathrm{Var}[f(Y)]}\right)^{1/q}}$$

where $\mathrm{Var}[f(Y)]$ is the variance of the random variable $f(Y)$ and $R > 0$ is an
absolute constant.

The history of these concentration and corresponding moment inequalities is quite rich; see
S. Janson [10] (Sections V and VI). Latala [?] tightened these inequalities for Normal random
variables using smoothness parameters similar but incomparable to ours (see the next Section).

Unfortunately, the Hypercontractivity and even Latala Concentration Inequalities do not strictly
dominate our concentration inequality (Theorem 1.3). Our concentration behaves better for small
values of $\lambda$ with respect to the Hypercontractivity Concentration Inequality, and for some
polynomials we beat the Latala bounds for large values of $\lambda$ since our smoothness
parameters are incomparable.

The conclusion is that it is likely that there exists a yet to be discovered concentration
inequality for polynomials of independent centered subgaussian random variables that dominates
ours (Theorem 1.3), Hypercontractivity (Theorem 1.6) and Latala's concentration inequalities in
this setting. Deriving such an inequality is a challenging open problem.



In our previous work we provide an extensive comparison of Theorem 1.4 and its analog
for general polynomials with various known concentration inequalities for polynomials. Mossel,
O'Donnell and Oleszkiewicz [16] showed that the distribution of a multilinear polynomial of
independent random variables is approximately invariant with respect to the distribution of the
random variables as long as the random variables have mean 0 and variance 1. In particular they
bound

$$\left|\Pr[f(X_1, \ldots, X_n) \ge \lambda] - \Pr[f(G_1, \ldots, G_n) \ge \lambda]\right|$$

where $f$ is a multilinear polynomial, $X_1, \ldots, X_n, G_1, \ldots, G_n$ are independent random
variables with mean 0 and variance 1, and $G_1, \ldots, G_n$ have a Gaussian distribution. Such
bounds can be considered to be a generalization of the Berry-Esseen type bounds because in the
linear case the sum of Gaussians $f(G_1, \ldots, G_n)$ has a Gaussian distribution. Note that, as
usual, central limit theorem or invariance principle type results apply to a very wide range of
random variables but give weaker concentration bounds (polynomial instead of exponential).

1.3 Our Techniques 

Our work follows the same general scheme of the moment computation method developed in the
proof of Theorem 1.4 in [18], but there are many subtle differences in the proofs since we
basically want to replace each term $\mu_0 \mu_q$ in the proof of the Initial Moment Lemma from
[18] with variance. For example, our Section 2.2 and the analogous Section 2.2 in [18] are devoted
to bounding a certain sum (the sum over $\pi$ in (2.12)). Previous work [18] gets a factor $\mu_t$
for some $0 \le t \le q$ for each hyperedge in a certain hypergraph $G'$. Each connected component
of $G'$ includes two hyperedge weights, call them $w_1$ and $w_2$, which contribute to a $\mu_0$
and a $\mu_q$ factor respectively in [18]. Instead of having $w_1 w_2$ contribute to a
$\mu_0 \mu_q$ we do the following. We bound $w_1 w_2 \le (w_1^2 + w_2^2)/2$ and then $w_1^2$ and
$w_2^2$ each contribute to a factor of variance. Our Ordering Lemma in Section 4 is different
from the analogous statement in [18]. To transition from the Initial Moment Lemma to the General
Even Moment Lemma we use certain orthogonality properties of multilinear polynomials which do
not seem to hold for general polynomials. Our key property of random variables (central moment
boundedness) is different from the moment boundedness in [18], which forced us to re-prove that
the classical classes of discrete and continuous random variables satisfy that property.



While we were able to extend Theorem 1.4 in [18] to the case of general polynomials, we were not
able to prove a similar extension of Theorem 1.3. While we believe that such a statement is true,
it seems it would require another property of random variables different from moment boundedness
or central moment boundedness.

1.4 Outline 



The high-level organization of our analysis follows [18]. Sections 2 and 3 state and prove key
lemmas on the moments of polynomials of variables with zero expectation. Section 4 proves various
technical lemmas that are omitted from the main flow. Section 5 states and proves bounds on the
moments of polynomials with arbitrary expectation. Section 6 uses those bounds and Markov's
inequality to prove Theorem 1.3. Section 7 shows that a wide variety of classical random
variables are central moment bounded.



2 Moment Lemma for Centered Multilinear Polynomials 



The proof of Theorem 1.3 will follow from the application of Markov's inequality to the
upper bound on the $k$-th moment of the polynomial in question. The first step is to look at
moments of "centered" multilinear polynomials that replace $Y_v$ with $Y_v - E[Y_v]$.

Lemma 2.1 (Initial Moment Lemma) We are given a hypergraph $H = ([n], \mathcal{H})$, $n$
independent central moment bounded random variables $Y = (Y_1, \ldots, Y_n)$ with the same
parameter $L > 0$ and a polynomial $g(x)$ with nonnegative coefficients $w_h \ge 0$ such that
every monomial (or hyperedge) $h \in \mathcal{H}$ has power (or cardinality) exactly $q$. We
define random variables $X_v = Y_v - E[Y_v]$ for $v \in [n]$. Then for any integer $k \ge 1$ we
have



E 



I < m_ax ji^f L^^-^'^o-^ . fc9fc-(9-i)'^o-^ . Var[g{X)]''° ■ (^^fit{g,Yy 



(2.8) 



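where the constant appearing in (2.8) is some absolute constant greater than 1,
$\ell = \sum_{t=0}^{q} (q - t)\sigma_t$, and $\bar{\sigma} = (\sigma_0, \ldots, \sigma_q)$. The
maximum is over all non-negative integers $\sigma_t$, $0 \le t \le q$, satisfying
$2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$ and $\ell \le qk/2$.

Note that the constraint $2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$ in Lemma 2.1 implies
$\sigma_0 \le k/2$, hence the powers of $L$ and $k$ in (2.8) are non-negative.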



Proof. Fix hypergraph $H = ([n], \mathcal{H})$, random variables $Y = (Y_1, \ldots, Y_n)$,
non-negative weights $\{w_h\}_{h \in \mathcal{H}}$, an integer $k$ and total power $q$. Without
loss of generality we assume that $\mathcal{H}$ is the complete uniform hypergraph (setting
additional edge weights to 0 as needed), i.e. $\mathcal{H}$ includes every possible hyperedge over
vertex set $[n]$ with $q$ vertices. Note that the cardinality of the hyperedge is equal to the
total power of the corresponding monomial in the polynomial $g(x)$.

A labeled hypergraph $G = (V(G), \mathcal{H}(G))$ consists of a set of vertices $V(G)$ and a
sequence of $k$ (not necessarily distinct) hyperedges
$\mathcal{H}(G) = \langle h_1, \ldots, h_k \rangle$. In other words a labeled hypergraph is
a hypergraph whose $k$ hyperedges are given unique labels from $[k]$. We write e.g.
$\prod_{h \in \mathcal{H}(G)} w_h$ as a shorthand for $\prod_{i=1}^{k} w_{h_i}$ where
$\mathcal{H}(G) = \langle h_1, \ldots, h_k \rangle$; in particular duplicate hyperedges count
multiple times in such a product.

Consider the sequence of hyperedges $h_1, \ldots, h_k \in \mathcal{H}$ from our original
hypergraph $H$. These hyperedges define a labeled hypergraph $H(h_1, \ldots, h_k)$ with vertex set
$\cup_{i=1}^{k} V(h_i)$ and hyperedge sequence $h_1, \ldots, h_k$. Note that the vertices of
$H(h_1, \ldots, h_k)$ are labeled by the indices from $[n]$ and the edges are labeled by the
indices from $[k]$. Note also that some hyperedges in $H(h_1, \ldots, h_k)$ could span the same
set of vertices, i.e. they are multiple copies of the same hyperedge in the original hypergraph
$H$. Let $\mathcal{P}(H, k)$ be the set of all such edge and vertex labeled hypergraphs that can
be generated by any $k$ hyperedges from $H$. We say that the degree of a vertex (in a hypergraph)
is the number of hyperedges it appears in. Let $\mathcal{P}_2(H, k) \subseteq \mathcal{P}(H, k)$
be the set of such labeled hypergraphs where each vertex has degree at least two. We split the
whole proof into more digestible pieces by subsections.

2.1 Changing the vertex labeling

In this section we will show how to transform the formula for the $k$-th moment to have the
summation over the hypergraphs that have their own set of labels instead of being labeled by the
set $[n]$. Let $X_v = Y_v - E[Y_v]$ for $v \in [n]$. By linearity of expectation, independence of
the random variables $X_v$ for different vertices $v$ and the definition of $\mathcal{P}(H, k)$
we obtain
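$$E\left[g(X)^k\right] = \sum_{G \in \mathcal{P}(H,k)} E\left[\prod_{h \in \mathcal{H}(G)} \left( w_h \prod_{v \in V(h)} X_v \right)\right] = \sum_{G \in \mathcal{P}(H,k)} \left(\prod_{h \in \mathcal{H}(G)} w_h\right) \prod_{v \in V(G)} E\left[X_v^{d_v(G)}\right]$$

$$= \sum_{G \in \mathcal{P}_2(H,k)} \left(\prod_{h \in \mathcal{H}(G)} w_h\right) \prod_{v \in V(G)} E\left[X_v^{d_v(G)}\right] \qquad (2.9)$$

$$\le \sum_{G \in \mathcal{P}_2(H,k)} \left(\prod_{h \in \mathcal{H}(G)} w_h\right) \prod_{v \in V(G)} E\left[|X_v|^{d_v(G)}\right] \qquad (2.10)$$

where the equality (2.9) follows from the fact that $E[X_v] = 0$ for all $v \in V(h)$ and
$d_v(G)$ is the degree of vertex $v$ in the hypergraph $G$.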
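
The restriction to hypergraphs in which every vertex has degree at least two can be checked by exact enumeration on a tiny instance. The Python sketch below is an illustration only: the weighted triangle, the centered two-point law and all function names are hypothetical choices, not taken from the paper. It compares $E[g(X)^k]$ computed directly with the sum over sequences of $k$ hyperedges, once without and once with the minimum-degree restriction.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import prod

# Hypothetical toy instance (not from the paper): a weighted triangle, so q = 2.
WEIGHTS = {(1, 2): Fraction(1), (2, 3): Fraction(2), (1, 3): Fraction(3)}
# Centered two-point law: X = 2 with prob. 1/3 and X = -1 with prob. 2/3, so E[X] = 0.
LAW = [(Fraction(2), Fraction(1, 3)), (Fraction(-1), Fraction(2, 3))]
VERTICES = sorted({v for h in WEIGHTS for v in h})


def moment(d):
    """E[X^d] under LAW."""
    return sum(p * x ** d for x, p in LAW)


def kth_moment_direct(k):
    """E[g(X)^k] by exact enumeration of all joint outcomes."""
    total = Fraction(0)
    for outcome in product(LAW, repeat=len(VERTICES)):
        x = {v: value for v, (value, _) in zip(VERTICES, outcome)}
        prob = prod(p for _, p in outcome)
        g = sum(w * prod(x[v] for v in h) for h, w in WEIGHTS.items())
        total += prob * g ** k
    return total


def kth_moment_by_hypergraphs(k, min_degree):
    """Sum over sequences of k hyperedges whose vertices all have degree >= min_degree."""
    total = Fraction(0)
    for edges in product(WEIGHTS, repeat=k):
        degree = Counter(v for h in edges for v in h)
        if min(degree.values()) >= min_degree:
            total += prod(WEIGHTS[h] for h in edges) * prod(
                moment(d) for d in degree.values()
            )
    return total
```

Any sequence containing a vertex of degree one carries a factor $E[X_v] = 0$, which is why dropping those sequences does not change the sum.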



Note that a labeled hypergraph $G \in \mathcal{P}_2(H, k)$ could have the number of vertices
ranging from $q$ up to $kq/2$ since every vertex has degree at least two. For $k$ and $q$ clear
from context, let $S_2(\ell)$ be the set of labeled hypergraphs with vertex set $[\ell]$ having
$k$ hyperedges such that each hyperedge has cardinality exactly $q$ and every vertex has degree at
least 2. For each hypergraph $G \in S_2(\ell)$ the vertices are labeled by the indices from the
set $[\ell]$ and the hyperedges are labeled by the indices from the set $[k]$. Let $M(S)$ for
$S \subseteq [\ell]$ be the set of all possible injective functions $\pi : S \to [n]$; in
particular $M([\ell])$ is the set of all possible injective functions $\pi : [\ell] \to [n]$. We
will use the notation $\pi(h)$ for a copy of hyperedge $h \in \mathcal{H}(G)$ with its vertices
relabeled by the injective function $\pi$, i.e. $V(\pi(h)) = \{\pi(v) : v \in V(h)\}$. We claim
that



\X. 



7r(«) I 



\du{G') 



(2.11) 



E n n ^[i^^i'"^'^ 

G€V2{H,k) \heH(G) I \t;GV(G) 

= E^ E E f n -^i>^ ( n ^ 

e=q ' G'eS2{e)7r€M{[e]) \hen{G') / \«gV(G') 

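Indeed, every labeled hypergraph $G = (V(G), \mathcal{H}(G)) \in \mathcal{P}_2(H, k)$ on $\ell$
vertices has $\ell!$ labeled hypergraphs $G' = (V(G'), \mathcal{H}(G')) \in S_2(\ell)$ that differ
from $G$ by vertex labelings only. Each of those hypergraphs has one corresponding mapping $\pi$
that maps its $\ell$ vertex labels into vertex labels of hypergraph $G \in \mathcal{P}_2(H, k)$.

Then, combining (2.10) and (2.11) we obtain

$$E\left[g(X)^k\right] \le \sum_{\ell = q}^{kq/2} \frac{1}{\ell!} \sum_{G' \in S_2(\ell)} \sum_{\pi \in M([\ell])} \left(\prod_{h \in \mathcal{H}(G')} w_{\pi(h)}\right) \left(\prod_{u \in V(G')} E\left[|X_{\pi(u)}|^{d_u(G')}\right]\right). \qquad (2.12)$$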
2.2 Estimating the term for each hypergraph G'



We now fix integer $\ell$ and labeled hypergraph $G' \in S_2(\ell)$. Let $c$ be the number of
connected components in $G'$, i.e. $c$ is a maximal number such that the vertex set $V(G')$ can be
partitioned into $c$ parts $V_1, \ldots, V_c$ such that for each hyperedge
$h \in \mathcal{H}(G')$ and any $j \in [c]$ if $V(h) \cap V_j \ne \emptyset$ then
$V(h) \subseteq V_j$. Intuitively, we can split the vertex set of $G'$ into $c$ components such
that there are no hyperedges that have vertices in two or more components. By definition of
degree $\sum_{v \in V(G')} d_v = qk$ and $d_v \ge 2$ for all $v \in V(G')$.
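
The quantities fixed in this paragraph (the number of connected components $c$ and the vertex degrees) are easy to compute for a concrete labeled hypergraph. The following Python sketch uses a small union-find structure; the function names and the toy hypergraph are illustrative assumptions rather than objects from the paper.

```python
from collections import Counter


def count_components(num_vertices, hyperedges):
    """Number of connected components of a hypergraph on vertices 1..num_vertices."""
    parent = list(range(num_vertices + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for h in hyperedges:
        first, *rest = sorted(h)
        for v in rest:
            # Merge every vertex of the hyperedge into the component of its first vertex.
            parent[find(v)] = find(first)
    return len({find(v) for v in range(1, num_vertices + 1)})


def degrees(hyperedges):
    """Degree of each vertex: the number of hyperedges (with multiplicity) containing it."""
    return Counter(v for h in hyperedges for v in h)


# Hypothetical G' with 6 vertices and k = 4 hyperedges of cardinality q = 3.
toy_edges = [{1, 2, 3}, {1, 2, 3}, {4, 5, 6}, {4, 5, 6}]
```

For the toy hypergraph the degrees sum to $qk = 12$ and every vertex has degree two, so it would belong to the set of hypergraphs considered in the proof.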

We use a canonical ordering $h^{(1)}, \ldots, h^{(k)}$ of the hyperedges in $\mathcal{H}(G')$
that will be specified later in Lemma 4.3. (This canonical ordering is distinct from and should
not be confused with the ordering of the hyperedges inherent in a labeled hypergraph.) We
iteratively remove hyperedges from the hypergraph $G'$ in this order. Let
$G'_s = (V'_s, \mathcal{H}'_s)$ be the hypergraph defined by the hyperedges
$\mathcal{H}'_s = h^{(s)}, \ldots, h^{(k)}$ and vertex set
$V'_s = \cup_{h \in \mathcal{H}'_s} V(h)$. In particular $G'_1$ is identical to $G'$ except for
the order of the hyperedges. Let $V_s$ be the vertices of the hyperedge $h^{(s)}$ that have degree
one in the hypergraph $G'_s$, i.e. $V'_{s+1} = V'_s \setminus V_s$. By definition,
$0 \le |V_s| \le q$. Moreover, $0 \le |V_s| \le q - 1$ for $s = 1, \ldots, k - c$ by Lemma 4.3
since the hyperedge $h^{(s)}$ must be connected with at least one of the remaining hyperedges. By
the properties of the canonical ordering $h^{(1)}, \ldots, h^{(k)}$ from Lemma 4.3 we know that
the first $c$ edges (set $S_2$ of hyperedges) in that ordering belong to different connected
components. Since the degree of each node is at least two we obtain that
$V_1 = \cdots = V_c = \emptyset$.

Analogously, we consider the second canonical ordering
$\tilde{h}^{(1)}, \ldots, \tilde{h}^{(k)}$ from Lemma 4.3 and define analogous notions. Let
$\tilde{G}'_s = (\tilde{V}'_s, \tilde{\mathcal{H}}'_s)$ be the hypergraph defined by the
hyperedges $\tilde{\mathcal{H}}'_s = \tilde{h}^{(s)}, \ldots, \tilde{h}^{(k)}$ and vertex set
$\tilde{V}'_s = \cup_{h \in \tilde{\mathcal{H}}'_s} V(h)$. Let $\tilde{V}_s$ be the vertices of
the hyperedge $\tilde{h}^{(s)}$ that have degree one in the hypergraph $\tilde{G}'_s$, i.e.
$\tilde{V}'_{s+1} = \tilde{V}'_s \setminus \tilde{V}_s$. The first $c$ edges in the second
canonical ordering $\tilde{h}^{(1)}, \ldots, \tilde{h}^{(k)}$ define the set $S_1$ of hyperedges
that belong to different connected components. Therefore,
$\tilde{V}_1 = \cdots = \tilde{V}_c = \emptyset$.



Let $S_1$ and $S_2$ be the sets of special hyperedges defined in Lemma 4.3. Each set $S_i$
contains exactly one hyperedge per connected component of hypergraph $G'$ and therefore,
hyperedges belonging to the same $S_i$ are disjoint. Let $W_1 = \cup_{h \in S_1} V(h)$ be the set
of vertices incident to hyperedges in $S_1$ and $W_2 = \cup_{h \in S_2} V(h)$ be the set of
vertices incident to hyperedges in $S_2$. Note that $|W_1| = |W_2| = qc$.

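We apply the standard fact $ab \le \frac{a^2 + b^2}{2}$ to the
$\prod_{h \in S_1 \cup S_2} w_{\pi(h)}$ in the last term of the inequality (2.12) and obtain

$$\prod_{h \in S_1 \cup S_2} w_{\pi(h)} \le \frac{1}{2} \left(\prod_{h \in S_1} w_{\pi(h)}^2\right) + \frac{1}{2} \left(\prod_{h \in S_2} w_{\pi(h)}^2\right). \qquad (2.13)$$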



We will use the notation $d_u$ instead of $d_u(G')$. By central moment boundedness, we estimate



n E 

tteV(G') 



X 



\du 



Tv{u) I 



X. 



■k(u) 



yU<^V{G')\W[ 



\«GV(G') 

\«eV(G") 



n IE [^'(«)^ 



n ^ [^liu) 



n ^[\^Au)\] 

u&V{G')\W[ 

n iE[|y.(.)|] 

MeV(G')\VKi' 

(2.14) 



where the last inequality uses the inequality
$E[|X_{\pi(u)}|] = E[|Y_{\pi(u)} - E[Y_{\pi(u)}]|] \le E[|Y_{\pi(u)}|] + |E[Y_{\pi(u)}]| \le 2E[|Y_{\pi(u)}|]$.

Analogously, 



n 

«6V(G') 



7r(«) I 



< 2^-^^^^^=---^ I n duiww^ [x^(„) 



'aGV(G') 



n E[|X,(„)|](2.15) 
wGV(G')\V1/^ 



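Recall $V'_1 = V(G')$, $V'_s = V(G') \setminus \cup_{t=1}^{s-1} V_t$ for $s = 1, \ldots, k$ and
$V_1 = \cdots = V_c = \emptyset$. Analogously,
$\tilde{V}'_s = V(G') \setminus \cup_{t=1}^{s-1} \tilde{V}_t$ for $s = 1, \ldots, k$ and
$\tilde{V}_1 = \cdots = \tilde{V}_c = \emptyset$. For each $s = c + 1, \ldots, k - c$, we will
use the notations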



n ^ [^^(.) 
n JE [^liu) 



n ^[\YAu)\ 
ueVs{G')\w^ 



(2.16) 



Therefore, combining inequalities (2.13), (2.14), (2.15) and notations (2.16), for each graph
$G' \in S_2(\ell)$, we obtain



Y. n n ^ 

7reM([£]) \heH{G') I \«GV(G') 

< 2^-2'?c-l^gfe-<?c-£ I -Q 

\«GV(G') 



\x. 



Tr{u) I 



|rf^,(G') 



E n 

+ E fn 

7rgA/([^]) \/i652 



W, 



Ah) ) • 1 n ^-c*) I ^i(^) 
'-(h) I • 1 n "^-w I ^i(^) 



(2.17) 
We now analyze two terms in the inequality (2.17) separately using different canonical orderings of
the hyperedges from Lemma U. We consider m/ie5i ^^(h)) " (n/i=fc(<=+i),...,h(fe-<=) ^7r(h)) T^iItt) 



and the corresponding canonical ordering of the hyperedges h^'^~^^\ . . . , h^''\ For each s = c + 
1, . . . ,k — c we obtain 



E n 

7rgAf(V^) \h£Si 



n 



V(h) 



T,(7r) 



n 

s.t. TT extends it' 



E E 



w 



■n{h) 



n - 

,h=h(»+i),...,/i('=-=) 



/ V veVs J 



E n 



n 

vh=h(''+l),...,/i('=-'=) 



7r'(h) '^s+liT^') 



E 



7rGA/(V^) V v(^Vs 
.s.t. TT extends tt' 



where we say that $\pi$ extends $\pi'$ if $\pi(v) = \pi'(v)$ for every $v$ in the domain of
$\pi'$.

We now group the sum over $\pi$ by the value of $\pi(h^{(s)}) = h \in \mathcal{H}$. Note that for
any fixed mapping $\pi' \in M(V'_{s+1})$ there are exactly $|V_s|!$ possible mappings
$\pi \in M(V'_s)$ that extend $\pi'$ and map the vertex labels of hyperedge $h^{(s)} \in G'$ into
vertex labels of the hyperedge $h \in \mathcal{H}$. Let
$S' = \{\pi'(v) : v \in V(h^{(s)}) \setminus V_s\}$, which is the portion of $\pi(V(h^{(s)}))$
that is fixed by $\pi'$. Then



7reA/(V^) 
s.t. IT extends vr' 



\Vs\\ "^1^ H ^[l^""! 

h£H:V(h)DS' u£Vih)\S' 



< \Vs\l max 

5:|S|=g-|ya| 



E 



Wh 



n 



h(^n:V{h)^S «ev(/i)\s 



For $s = k - c + 1, \ldots, k$, $h^{(s)} \in S_1$, $|V_s| = q$ and we have a similar argument
(note that $|M(V'_{k+1})| = |M(\emptyset)| = 1$)



E n 



7r{h) 



n ^ [^'(«) 



n 



Ah) 



s.t. TT extends tt' 



n 



E 



7r(M) 



E 



n 



n ^ 



^2 



E 



-2 

7rGA/(V^) \ ueVs 
s.t. TT extends tt' 



n 



n 



TT'{h) 



n ^ 
n E 



^7r'(u) 



^tt'{u) 



where in the last equality we used Lemma 4.1.

In the end we bound the first term of (2.17) as follows:



E-^ n 

Var[g{X)], 



/l=/l{c + l),... 



k—c 



vmgV(G') / 7reA/([£]) \heSi 
< 2^-2'^-iL'?'^-.-M n f n \Vs\lVar[g{X)]\ J] 

^i)GV(G') / \s=k-c+l / s=c+l 

Vt^GVCG') / Vs=l / i=l 



^■!;GV(G') 



(2.18) 



t=l 



where $\sigma_0 = c$, $\sigma_t$ for $t \ge 1$ is the number of indices
$s = c + 1, \ldots, k - c$ with $q - |V_s| = t$ and $\mu_t = \mu_t(w, Y)$. In the last inequality
we used the fact that $\sum_{s=1}^{k} |V_s| = \ell$ and $|V_s| \le q$. The quantities $\sigma_t$
must satisfy the equalities $\sum_{t=0}^{q} (q - t)\sigma_t = \ell$ and
$2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$.

Using an analogous argument for the second canonical ordering we show



.mGV(G') / 7rGA/(M) \heS2 



n I ^l(^) 

ih=h(':+l),...,h('=-=) 



.dGV(G') 



(2.19) 



t=i 
where $\tilde{\sigma}_0 = c$ and $\tilde{\sigma}_1, \ldots, \tilde{\sigma}_q$ is a different
collection of powers satisfying the conditions of the Lemma. Combining the inequalities (2.17),
(2.18) and (2.19) we derive



Yi n "^-w 

7reM{[£]) \h£H{G') 




< max <! 2^-2'/<xo^gfc-q<xo-^ ( 41 Var[g{X)Y°q^ ■ J| ^i^' 

.v£V{G') I t=l 



(2.20) 



where the maximum is over all non-negative integers $\sigma_0, \sigma_1, \ldots, \sigma_q$
satisfying $2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$, $\sigma_0 = c$ and
$\ell = \sum_{t=0}^{q} (q - t)\sigma_t$.

2.3 Using the Counting Lemma 

We decompose $S_2(\ell)$ as $S_2(\ell) = \bigcup_{c} \bigcup_{d \ge 2} S(\ell, c, d)$, where $2$
is a vector of $\ell$ twos and $S(\ell, c, d)$ is the set of vertex and hyperedge labeled
hypergraphs with vertex set $[\ell]$ and $k$ labeled hyperedges such that each hyperedge has
cardinality $q$, the number of connected components is $c$, and the degree vector is $d$. (Note
that $S(\ell, c, d)$ depends on $k$ and $q$ as well.) Let
$\bar{\sigma} = (\sigma_0, \ldots, \sigma_q)$. Combining (2.12) and (2.20) we obtain



E 



kq/2 e/q 



< niax 

l,c,d>2,a 



< niax 

£,c,d>2,a 



=9 c=l d>2 \G'G5(£,c,d) 
1 . 



max < 



2^-2g^o2,'?fe-'?'^o-Var[5(X)]'^o 11 -"^^ N'^ 11 '^'^ 

\t=i / ve[e] 

2"'=+^ • \S{i,c,d)\ ■ 2^-2''-«L'?'=-'?-o-Var[g(X)]'^" (jl^'"*) ^^5'^"'} 



k£ 
2~i] 



2g(fe-2ao)+2£ . i,'/fe-'/'^o-Var[5(X)]'^o • q^ ■ Rf ■ A:«'=-(9-i)c . m 



\s{i,c,d)\Ylydu\ < this 

by counting Lemma 4.4 



where the sum is over $d \ge 2$ with $\sum_{v \in [\ell]} d_v = qk$ and the maximum over
$\bar{\sigma}$ has the same constraints as in (2.20). The maximums over $\ell$, $c$, and $d$ are
over the same sets that those quantities were previously summed over. The second inequality
follows from the fact that the total number of
feasible degree vectors d is at most 2'^^'^^ {qk is the sum of all the degrees and we need to compute 
the total number of partitions of the array with qk entries into i possible groups of consecutive 
entries which is 
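
The counting step above relies on the classical fact that an array with $m$ entries can be split into $\ell$ non-empty groups of consecutive entries in $\binom{m-1}{\ell-1}$ ways, which is less than $2^m$. The Python sketch below checks this by brute force for small values and also counts the degree vectors that additionally satisfy $d_v \ge 2$. The function name and the sample sizes are illustrative assumptions.

```python
from itertools import product
from math import comb


def count_degree_vectors(total, parts, min_degree=1):
    """Number of vectors d of length `parts` with entries >= min_degree summing to `total`."""
    return sum(
        1
        for d in product(range(min_degree, total + 1), repeat=parts)
        if sum(d) == total
    )


if __name__ == "__main__":
    total, parts = 8, 3  # e.g. qk = 8 split among 3 vertices
    print(count_degree_vectors(total, parts))                # all positive vectors
    print(count_degree_vectors(total, parts, min_degree=2))  # vectors with d_v >= 2
    print(comb(total - 1, parts - 1), 2 ** total)
```

Degree vectors with all entries at least two form a subset of the positive vectors, so the same binomial count bounds their number.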

We now substitute $\sigma_0$ for $c$, and remove the unreferenced variables $c$ and $d$ from the
maximum. We also remove $\ell$ from the maximum since it is completely defined by the vector
$\bar{\sigma}$. We continue



E 



< max I ^ • L5^-5'^°-Var[5(X)]'^o • / • Rf ■ /^^^-(^-D-o . \^ 

< m_ax |i?f • • L'?'=-'?'^o-Var[(7(X)]'^« • . (^H/^r^ | 



< m_ax ^RfLi''-i''°-^Var[g{X)P ■ k^''-^'^-^ 



1^? 



\t=i 



where $R_0 < R_1 < R_2 < R_3$ are some absolute constants, the second inequality uses the fact that $\ell! \ge (\ell/e)^{\ell}$, and the last inequality is implied by the fact that

$$\left(\frac{k}{\ell}\right)^{\ell} \le \max_{x > 0} \left(\frac{k}{x}\right)^{x} = e^{k/e}.$$

Inequality (2.21) is precisely the inequality (2.8) that we needed to prove. ■
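The two elementary facts used in the last steps can be sanity-checked numerically. The sketch below is illustrative only: it assumes the second fact is the standard identity $\max_{x>0}(k/x)^x = e^{k/e}$, and the function names are hypothetical. Both comparisons are done in log space to avoid overflow.

```python
import math

def factorial_lower_bound_holds(l):
    # Checks l! >= (l/e)^l via log(l!) >= l*(log(l) - 1).
    return math.lgamma(l + 1) >= l * (math.log(l) - 1)

def power_bound_holds(k, l):
    # Checks (k/l)^l <= e^(k/e) via l*log(k/l) <= k/e.
    # A tiny tolerance absorbs floating-point rounding near the maximizer l = k/e.
    return l * math.log(k / l) <= k / math.e + 1e-12
```

The maximizer of $x \ln(k/x)$ is $x = k/e$, which is where the second comparison is tight.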

3 Intermediate moment lemma 

Lemma 3.1 (Intermediate Moment Lemma) We are given $n$ independent central moment bounded random variables $Y = (Y_1, \ldots, Y_n)$ with the same parameter $L > 0$ and a general polynomial $g(x)$ with nonnegative coefficients such that every monomial (or hyperedge) $h \in \mathcal{H}$ has power exactly $q$. Let $X_v = Y_v - E[Y_v]$; then

$$E\left[g(X)^k\right] \le \max\left\{ \left(\sqrt{k R_3^2 \mathrm{Var}[g(X)]}\right)^{k}, \; \max_{t \in [q]} \left(k^t R_3^t L^t \mu_t(g, Y)\right)^{k} \right\} \tag{3.22}$$

where $R_3 \ge 1$ is some absolute constant and $X$ is the vector of centered random variables defined in Lemma 2.1.



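Proof. First we note that $2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$ and $\sum_{t=0}^{q} (q - t)\sigma_t = \ell$ imply

$$\sum_{t=1}^{q} t\sigma_t = qk - q\Big(2\sigma_0 + \sum_{t=1}^{q} \sigma_t\Big) + \sum_{t=1}^{q} t\sigma_t = qk - q\sigma_0 - \sum_{t=0}^{q} (q - t)\sigma_t = qk - q\sigma_0 - \ell.$$

Therefore, $\sigma_0 + \sum_{t=1}^{q} t\sigma_t = qk - (q - 1)\sigma_0 - \ell$. Combining these facts with Lemma 2.1, we derive

$$E\left[g(X)^k\right] \le \max_{\sigma} \left\{ \left(k R_3^2 \mathrm{Var}[g(X)]\right)^{\sigma_0} \cdot \prod_{t=1}^{q} \left(k^t R_3^t L^t \mu_t\right)^{\sigma_t} \right\} \le \max\left\{ \left(\sqrt{k R_3^2 \mathrm{Var}[g(X)]}\right)^{k}, \; \max_{t \in [q]} \left(k^t R_3^t L^t \mu_t\right)^{k} \right\},$$

where the last inequality is based on the fact that $2\sigma_0 + \sum_{t=1}^{q} \sigma_t = k$. ■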
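The index bookkeeping in this proof is easy to get wrong, so a brute-force check may help. The sketch below (function names are hypothetical) enumerates every non-negative integer vector $\sigma$ with $2\sigma_0 + \sum_{t \ge 1} \sigma_t = k$, defines $\ell = \sum_t (q-t)\sigma_t$, and verifies the two exponent identities. It also checks the final step in the abstract form $a_0^{2\sigma_0} \prod_t a_t^{\sigma_t} \le (\max_t a_t)^k$ for non-negative numbers $a_t$.

```python
from itertools import product

def feasible_sigmas(k, q):
    """Yield all (sigma_0, ..., sigma_q) with 2*sigma_0 + sum_{t>=1} sigma_t == k."""
    for sigma in product(range(k + 1), repeat=q + 1):
        if 2 * sigma[0] + sum(sigma[1:]) == k:
            yield sigma

def check_identities(k, q):
    """Verify sum t*sigma_t = qk - q*sigma_0 - l and its corollary for every feasible sigma."""
    for sigma in feasible_sigmas(k, q):
        ell = sum((q - t) * sigma[t] for t in range(q + 1))
        weighted = sum(t * sigma[t] for t in range(1, q + 1))
        if weighted != q * k - q * sigma[0] - ell:
            return False
        if sigma[0] + weighted != q * k - (q - 1) * sigma[0] - ell:
            return False
    return True

def product_at_most_max_power(a, sigma):
    """a[0] stands in for sqrt(k R^2 Var), a[t] for k^t R^t L^t mu_t (non-negative integers here)."""
    k = 2 * sigma[0] + sum(sigma[1:])
    prod = a[0] ** (2 * sigma[0])
    for t in range(1, len(sigma)):
        prod *= a[t] ** sigma[t]
    return prod <= max(a) ** k
```

The second function reflects why the exponents $2\sigma_0, \sigma_1, \ldots, \sigma_q$ summing to $k$ is all that the last inequality needs.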

4 Technical Lemmas 

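Lemma 4.1 We are given $n$ independent random variables $X_1, \ldots, X_n$ such that $E[X_i] = 0$ for all $i \in [n]$. We are also given a multilinear polynomial $f(x) = \sum_{h \in \mathcal{H}(H)} w_h \prod_{v \in V(h)} x_v$ with $|V(h)| \ge 1$ for any $h \in \mathcal{H}(H)$. Then

$$\mathrm{Var}[f(X)] = \sum_{h \in \mathcal{H}(H)} w_h^2 \prod_{v \in V(h)} E\left[X_v^2\right].$$

Proof. Let $\mathcal{H} = \mathcal{H}(H)$. Clearly $E[f(X)] = 0$, hence

$$\begin{aligned}
\mathrm{Var}[f(X)] &= E\left[(f(X) - E[f(X)])^2\right] = E\left[f(X)^2\right] \\
&= E\left[\sum_{h \in \mathcal{H}} \sum_{h' \in \mathcal{H}} w_h w_{h'} \Big(\prod_{v \in h} X_v\Big) \Big(\prod_{v \in h'} X_v\Big)\right] \\
&= \sum_{h \in \mathcal{H}} \sum_{h' \in \mathcal{H}} w_h w_{h'} \Big(\prod_{v \in h \cap h'} E\left[X_v^2\right]\Big) \Big(\prod_{v \in (h \setminus h') \cup (h' \setminus h)} E[X_v]\Big) \\
&= \sum_{h \in \mathcal{H}} w_h^2 \prod_{v \in h} E\left[X_v^2\right],
\end{aligned}$$

where the last equality follows because $E[X_v] = 0$ by assumption. ■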
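As an illustration of Lemma 4.1, the following sketch compares the exact variance of a small multilinear polynomial, computed by enumerating all outcomes, with the closed form $\sum_h w_h^2 \prod_{v \in h} E[X_v^2]$. The representation (a dict from `frozenset` of variable indices to coefficients, and finite distributions as lists of `(value, probability)` pairs) is an assumption made for the example, not notation from the paper.

```python
from fractions import Fraction
from itertools import product

def poly(weights, x):
    """Evaluate sum_h w_h * prod_{v in h} x_v; `weights` maps frozenset of indices to w_h."""
    total = Fraction(0)
    for h, w in weights.items():
        term = Fraction(w)
        for v in h:
            term *= x[v]
        total += term
    return total

def exact_variance(weights, dists):
    """dists[v] lists (value, probability) pairs for X_v; the X_v are independent."""
    mean = Fraction(0)
    second = Fraction(0)
    for outcome in product(*dists):
        p = Fraction(1)
        for _, pv in outcome:
            p *= pv
        val = poly(weights, [xv for xv, _ in outcome])
        mean += p * val
        second += p * val * val
    return second - mean * mean

def lemma_4_1_rhs(weights, dists):
    """Closed form from Lemma 4.1; valid when every X_v has mean zero."""
    second_moments = [sum(pv * xv * xv for xv, pv in d) for d in dists]
    total = Fraction(0)
    for h, w in weights.items():
        term = Fraction(w) ** 2
        for v in h:
            term *= second_moments[v]
        total += term
    return total
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than a tolerance.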

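Lemma 4.2 We are given $n$ independent random variables $X_1, \ldots, X_n$ such that $E[X_i] = 0$ for all $i \in [n]$. We are also given two multilinear polynomials $g_1(x) = \sum_{h \in \mathcal{H}(H)} w_h \prod_{v \in h} x_v$ and $g_2(x) = \sum_{h \in \mathcal{H}(H)} w'_h \prod_{v \in h} x_v$ such that $w_h w'_h = 0$ for any hyperedge $h$. Then $E[g_1(X) g_2(X)] = 0$.

Proof. The proof is similar to the proof of Lemma 4.1:

$$\begin{aligned}
E[g_1(X) g_2(X)] &= E\left[\sum_{h \in \mathcal{H}(H)} \sum_{h' \in \mathcal{H}(H)} w_h w'_{h'} \Big(\prod_{v \in h} X_v\Big) \Big(\prod_{v \in h'} X_v\Big)\right] \\
&= \sum_{h \in \mathcal{H}(H)} \sum_{h' \in \mathcal{H}(H)} w_h w'_{h'} \Big(\prod_{v \in h \cap h'} E\left[X_v^2\right]\Big) \Big(\prod_{v \in (h \setminus h') \cup (h' \setminus h)} E[X_v]\Big) \\
&= 0,
\end{aligned}$$

where the final equality follows because each $h, h'$ term either has $w_h w'_{h'} = 0$ or else $h \ne h'$ and hence at least one $E[X_v] = 0$ factor. ■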



Lemma 4.3 (Ordering Lemma) We are given a hypergraph $G' = (V, \mathcal{H})$ with $c$ connected components and degree of each vertex $\ge 2$. We can define two disjoint sets of hyperedges $S_1$ and $S_2$, each containing exactly one hyperedge per connected component of $G'$, i.e. $|S_1| = |S_2| = c$, such that there exist two canonical orderings $h^{(1)}, \ldots, h^{(k)}$ and $\tilde{h}^{(1)}, \ldots, \tilde{h}^{(k)}$ of the hyperedges $\mathcal{H}$ with the following properties:

1. $S_2 = \{h^{(1)}, \ldots, h^{(c)}\} = \{\tilde{h}^{(k-c+1)}, \ldots, \tilde{h}^{(k)}\}$ and $S_1 = \{h^{(k-c+1)}, \ldots, h^{(k)}\} = \{\tilde{h}^{(1)}, \ldots, \tilde{h}^{(c)}\}$, i.e. the hyperedges from $S_2$ appear first in the canonical ordering $h^{(1)}, \ldots, h^{(k)}$ and last in the canonical ordering $\tilde{h}^{(1)}, \ldots, \tilde{h}^{(k)}$, while the hyperedges from $S_1$ appear last in the canonical ordering $h^{(1)}, \ldots, h^{(k)}$ and first in the canonical ordering $\tilde{h}^{(1)}, \ldots, \tilde{h}^{(k)}$;

2. for any $s = 1, \ldots, k - c$ the hypergraph $G_s$ induced by the hyperedges $h^{(s)}, \ldots, h^{(k)}$ has exactly $c$ connected components;

3. analogously, for any $s = 1, \ldots, k - c$ the hypergraph $\tilde{G}_s$ induced by the hyperedges $\tilde{h}^{(s)}, \ldots, \tilde{h}^{(k)}$ has exactly $c$ connected components.

Proof. Let $C$ be the line graph of $G'$, i.e. an undirected graph with one vertex for each of the $k$ hyperedges of $G'$ and an edge connecting every pair of vertices that correspond to hyperedges with intersecting vertex sets. Pick an arbitrary spanning forest $\mathcal{F}$ of $C$. Pick two leaves arbitrarily from each connected component of $\mathcal{F}$ and arbitrarily put one from each component in $S_1$ and the others in $S_2$. The existence of at least two leaves in each component follows because all vertices of $G'$ have degrees at least 2 and hence each connected component has at least two hyperedges. It is easy to see that any tree with at least two vertices has at least two leaves.^

We show the construction of the first canonical ordering $h^{(1)}, \ldots, h^{(k)}$ only; the construction of the second canonical ordering is analogous with the roles of $S_1$ and $S_2$ swapped.

We pick $h^{(1)}, \ldots, h^{(k)}$ iteratively (in that order) as follows. Let $\mathcal{F}_i$ denote the subforest of $\mathcal{F}$ induced by vertices $\mathcal{H} \setminus \{h^{(1)}, \ldots, h^{(i-1)}\}$. We pick $h^{(i)}$ to be an arbitrary leaf of $\mathcal{F}_i$ subject to the constraint that $h^{(i)} \in S_2$ if $1 \le i \le c$, $h^{(i)} \notin S_1 \cup S_2$ if $c + 1 \le i \le k - c$, and $h^{(i)} \in S_1$ if $k - c + 1 \le i \le k$.

For any $1 \le i \le k$ we assert that:

1. there is a leaf satisfying the desired constraint available to be $h^{(i)}$ and

2. if $i \le k - c$ there are $c$ connected components of $\mathcal{F}_i$ and each contains a vertex (hyperedge of $G'$) in $S_1$.

The second property follows because we always choose a leaf and never choose a vertex from $S_1$.

For $1 \le i \le c$ the first property follows because removing a vertex from a graph cannot make a leaf into a non-leaf and every vertex of $S_2$ is a leaf in $\mathcal{F}_1$. For $c + 1 \le i \le k - c$ the first property follows because the second property implies that there is a connected component of $\mathcal{F}_i$ with at least two vertices and hence at least two leaves, of which at most one can be from $S_1$ and none from $S_2$. ■
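The construction in this proof is algorithmic, and a small sketch may make it concrete. The code below is an illustration under stated assumptions, not the authors' implementation: hyperedges are given as a list of Python sets, every vertex is assumed to have degree at least 2 (so each component has at least two hyperedges), and the helper names `count_components` and `canonical_ordering` are hypothetical. It builds the line graph, takes a spanning forest, assigns two leaves per tree to $S_1$ and $S_2$, and then peels leaves in the three phases described above.

```python
from itertools import combinations

def count_components(edges):
    """Number of connected components of the hypergraph formed by `edges` (list of sets)."""
    parent = list(range(len(edges)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(edges)), 2):
        if edges[i] & edges[j]:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(edges))})

def canonical_ordering(edges):
    """Return (order, S1, S2, c); `order` lists hyperedge indices in canonical order."""
    k = len(edges)
    # Line graph: two hyperedges are adjacent when they share a vertex.
    adj = {i: set() for i in range(k)}
    for i, j in combinations(range(k), 2):
        if edges[i] & edges[j]:
            adj[i].add(j)
            adj[j].add(i)
    # Spanning forest of the line graph via a simple graph search.
    forest = {i: set() for i in range(k)}
    seen, components = set(), []
    for root in range(k):
        if root in seen:
            continue
        seen.add(root)
        comp, stack = [root], [root]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    forest[u].add(w)
                    forest[w].add(u)
                    comp.append(w)
                    stack.append(w)
        components.append(comp)
    c = len(components)
    # Two leaves per tree: one goes to S1, the other to S2.
    S1, S2 = set(), set()
    for comp in components:
        leaves = [u for u in comp if len(forest[u]) == 1]
        S1.add(leaves[0])
        S2.add(leaves[1])
    # Peel leaves: first S2, then hyperedges outside S1 and S2, finally S1.
    remaining = {i: set(forest[i]) for i in range(k)}
    order = []
    for i in range(1, k + 1):
        if i <= c:
            pool = S2
        elif i <= k - c:
            pool = set(remaining) - S1 - S2
        else:
            pool = S1
        # Degree <= 1 covers both leaves and the isolated S1 vertices of the last phase.
        u = next(x for x in sorted(pool)
                 if x in remaining and len(remaining[x]) <= 1)
        order.append(u)
        for w in remaining.pop(u):
            remaining[w].discard(u)
    return order, S1, S2, c
```

On a returned `order`, one can confirm with `count_components` that every suffix starting at position $s \le k - c$ still has $c$ components, which is the property the ordering is designed to guarantee.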

The next lemma was proven in [18] in the setting of general polynomials. We state below a special case corresponding to multilinear polynomials.

^Indeed, root each tree arbitrarily; if the root has degree two or more, pick arbitrary leaf descendants (in the rooted sense) of two neighbors of the root; otherwise pick the root and an arbitrary leaf descendant of the root.
Lemma 4.4 (Main Counting Lemma [18]) For any $k, q \ge 1$, $\ell$, $c$ and $d \ge 2$ we have



\s{£,c,d)\ [Ud^i] < 



for some universal constant Rq > 1. 



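Lemma 4.5 (Hölder's Inequality) Let $p_1, \ldots, p_k \in (1, +\infty)$ be such that $\sum_{i=1}^{k} \frac{1}{p_i} = 1$. Then for an arbitrary collection $X_1, \ldots, X_k$ of random variables on the same probability space the following inequality holds:

$$E\left[\left|\prod_{i=1}^{k} X_i\right|\right] \le \prod_{i=1}^{k} \left(E\left[|X_i|^{p_i}\right]\right)^{1/p_i}.$$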



We will use the following corollary of Hölder's inequality.

Corollary 4.6 (Minkowski's Inequality) Let $k$ be a positive integer and $Z_1, Z_2, \ldots, Z_m$ be (potentially dependent) random variables with $E[|Z_i|^k] \le z_i^k$ for $z_i \in \mathbb{R}_+$. It follows that

$$E\left[\left|\sum_{i=1}^{m} Z_i\right|^{k}\right] \le \left(\sum_{i=1}^{m} z_i\right)^{k}. \tag{4.23}$$



5 General Even Moment Lemma 

Lemma 5.1 (General Even Moment Lemma) We are given $n$ independent central moment bounded random variables $Y = (Y_1, \ldots, Y_n)$ with the same parameter $L > 0$ and a general power $q$ polynomial $f(x)$. Let $k \ge 2$ be an even integer; then

$$E\left[|f(Y) - E[f(Y)]|^{k}\right] \le \max\left\{ \left(\sqrt{k R_4^2 \mathrm{Var}[f(Y)]}\right)^{k}, \; \max_{t \in [q]} \left(k^t R_4^t L^t \mu_t(f, Y)\right)^{k} \right\}, \tag{5.24}$$

where $R_4 \ge 1$ is some absolute constant.

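Proof. Let weight function $w$ and hypergraph $H = ([n], \mathcal{H})$ be such that $f(Y) = \sum_{h \in \mathcal{H}} w_h \prod_{v \in V(h)} Y_v$. Let $X_v = Y_v - E[Y_v]$. Let $\mathcal{H}'$ denote the set of all possible hyperedges (including the empty hyperedge) with at most $q$ vertices (from $V(H) = [n]$). First we note that

$$\begin{aligned}
f(Y) &= \sum_{h \in \mathcal{H}} w_h \prod_{v \in V(h)} (X_v + E[Y_v]) \\
&= \sum_{h' \in \mathcal{H}'} \sum_{h \in \mathcal{H} : V(h) \supseteq V(h')} w_h \Big(\prod_{v \in V(h) \setminus V(h')} E[Y_v]\Big) \Big(\prod_{v \in V(h')} X_v\Big) \\
&= \sum_{h' \in \mathcal{H}'} w'_{h'} \prod_{v \in V(h')} X_v,
\end{aligned} \tag{5.25}$$

where

$$w'_{h'} = \sum_{h \in \mathcal{H} : V(h) \supseteq V(h')} w_h \prod_{v \in V(h) \setminus V(h')} E[Y_v].$$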
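The re-centering step (5.25) can be illustrated with a short sketch. It assumes the same dict-of-`frozenset` representation of a multilinear polynomial as a convenience (this encoding and the function names are not from the paper). `recentered_weights` computes $w'_{h'}$ by distributing each monomial over all sub-hyperedges, and the check is that evaluating the original polynomial at $y$ agrees with evaluating the re-centered one at $x = y - E[Y]$.

```python
from fractions import Fraction
from itertools import combinations

def recentered_weights(weights, means):
    """w'_{h'} = sum over h containing h' of w_h * prod_{v in h minus h'} E[Y_v]."""
    new = {}
    for h, w in weights.items():
        members = sorted(h)
        for r in range(len(members) + 1):
            for sub in combinations(members, r):
                coeff = Fraction(w)
                for v in h - set(sub):
                    coeff *= means[v]
                key = frozenset(sub)
                new[key] = new.get(key, Fraction(0)) + coeff
    return new

def evaluate(weights, point):
    """Evaluate sum_h w_h * prod_{v in h} point[v]."""
    total = Fraction(0)
    for h, w in weights.items():
        term = Fraction(w)
        for v in h:
            term *= point[v]
        total += term
    return total
```

The coefficient of the empty hyperedge equals the polynomial evaluated at the means, which for independent variables is $E[f(Y)]$, matching the constant term that is split off in the next step.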
We next group the monomials on the right hand side of (5.25) by cardinality and sign of coefficient, yielding $m \le 2q$ polynomials $g^{(1)}, \ldots, g^{(m)}$ with corresponding weight functions for all monomials $w^{(1)}, \ldots, w^{(m)}$ and powers $q_1, \ldots, q_m$. That is,

$$f(Y) = w'_{\{\}} + \sum_{i=1}^{m} \sum_{h' : |h'| \ge 1} w^{(i)}_{h'} \prod_{v \in V(h')} X_v = E[f(Y)] + \sum_{i=1}^{m} g^{(i)}(X), \tag{5.26}$$

where $\{\}$ is the empty hyperedge. We have



\':V{h')DS v<^Vih')\S 



^ir{w^'\Y) < fir{w' ,Y) = max ^ \wh,\ JJ E[|y^ 

V{h' 

< max > > 

h>) 



' h':V(h')DS heH:Vih)DV{h') 



y^hw n ^[i^-i] n 

veV(h)\V(h') J veV{h')\Viho) 



< 2« max 

S:\S\=r 



(5.27) 



h:V{h)^S \v€V{h)\V(ho) 

where the last inequality follows from the fact that for any $h \in \mathcal{H}$ the number of different $h' \in \mathcal{H}'$ such that $V(h) \supseteq V(h') \supseteq S$ is at most $2^q$. In addition by Lemma 4.2 we derive

$$\mathrm{Var}[f(Y)] = \sum_{i=1}^{m} \mathrm{Var}\left[g^{(i)}(X)\right]. \tag{5.28}$$



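For even $k \ge 2$ Lemma 3.1 implies that

$$E\left[\left|g^{(i)}(X)\right|^{k}\right] = E\left[g^{(i)}(X)^{k}\right] \le \max\left\{ \left(\sqrt{k R_3^2 \mathrm{Var}\left[g^{(i)}(X)\right]}\right)^{k}, \; \max_{t} \left(k^t R_3^t L^t \mu_t(w^{(i)}, Y)\right)^{k} \right\} = z_i^k.$$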



Applying Corollary 4.6 together with (5.27) and (5.28) yields



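$$E\left[|f(Y) - E[f(Y)]|^{k}\right] \le E\left[\Big(\sum_{i=1}^{m} \left|g^{(i)}(X)\right|\Big)^{k}\right] \le \Big(\sum_{i=1}^{m} z_i\Big)^{k} \le m^k \max_{i \in [m]} z_i^k \le \max\left\{ \left(\sqrt{k R_4^2 \mathrm{Var}[f(Y)]}\right)^{k}, \; \max_{t \in [q]} \left(k^t R_4^t L^t \mu_t(f, Y)\right)^{k} \right\},$$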



where we choose -R4 > 1 such that m 



Proof. [Proof of Lemma 1.5] As in the proof of Lemma 5.1 we write $f(Y) = E[f(Y)] + \sum_{i=1}^{m} g^{(i)}(X)$ where $X_v = Y_v - E[Y_v]$. Let $\mathcal{H}'$, $m$, $g^{(i)}$, $w^{(i)}$ and $q_i$ be defined as in that proof. Using Lemma 4.1, the inequality (5.27) with $r = q_i$ and the central moment boundedness we get



hew v£V{h) 

< n i^LE[\X.,\]) 

hew vev{h) 

hew vev{h) 
= {4Ly^fi,^{w^'\Y)fio{w'^^,Y). (5.29) 



Combining Lemma 4.2 and the inequality (5.29) we get

m 

Var[f{Y)] = Y,Var[g^^Hx)] 



1=1 



< 2q max{ALY Hr{w^'\Y)^o{w^'\Y) 

re[q] 

< 2g49max4^LVr(i«,y)^o(w^,^) 

relq] 



where the last inequality uses (5.27). 

6 Proof of Theorem 1.3

Now we prove Theorem 1.3 by applying Markov's inequality.

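Proof. By Markov's inequality we derive

$$\Pr\left[|f(Y) - E[f(Y)]| \ge \lambda\right] = \Pr\left[|f(Y) - E[f(Y)]|^{k^*} \ge \lambda^{k^*}\right] \le \frac{E\left[|f(Y) - E[f(Y)]|^{k^*}\right]}{\lambda^{k^*}}.$$

We choose $k^* \ge 0$ to be the even integer such that $k^* \in (K - 2, K]$ for

$$K = \min\left\{ \frac{\lambda^2}{e^2 R_4^2 \mathrm{Var}[f(Y)]}, \; \min_{t \in [q]} \left(\frac{\lambda}{e R_4^t L^t \mu_t(f, Y)}\right)^{1/t} \right\},$$

i.e.

$$\frac{\sqrt{k^* R_4^2 \mathrm{Var}[f(Y)]}}{\lambda} \le \frac{1}{e} \quad \text{and} \quad \frac{(k^*)^t R_4^t L^t \mu_t(f, Y)}{\lambda} \le \frac{1}{e}$$

for all $t \in [q]$. Using inequality (5.24) from Lemma 5.1 we derive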

E[\f{Y)-E[f{Y)]r] 



Pr[|/(y)-E[/(y)]|>A] < 



A^' 



< max<e > .maxe" a 

*6M 



i/t 



< e^-max<e , maxe VMi*^ 

i6M 



for some universal constant R > R4. This implies the statement of the Theorem. 
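To see how the moment order is chosen in this argument, here is a minimal numeric sketch. It assumes the moment bound has the form (5.24), so that each of its two kinds of terms, divided by $\lambda$, must be at most $1/e$. The value passed as `R4` is purely hypothetical (the paper only asserts that an absolute constant exists), and the function names are introduced for illustration. `mu[t-1]` stands for $\mu_t(f, Y)$.

```python
import math

def choose_even_moment(lam, variance, L, mu, R4):
    """Return (K, k_star) where k_star is the even integer in (K - 2, K].

    Assumes at least one of `variance` and the entries of `mu` is positive.
    """
    candidates = []
    if variance > 0:
        candidates.append(lam ** 2 / (math.e ** 2 * R4 ** 2 * variance))
    for t, mu_t in enumerate(mu, start=1):
        if mu_t > 0:
            candidates.append((lam / (math.e * R4 ** t * L ** t * mu_t)) ** (1.0 / t))
    K = min(candidates)
    k_star = 2 * math.floor(K / 2)  # largest even integer not exceeding K
    return K, k_star

def markov_tail_bound(lam, variance, L, mu, R4):
    """Markov bound e^{-k_star} implied when every term of (5.24) is at most (lam/e)^{k_star}."""
    _, k_star = choose_even_moment(lam, variance, L, mu, R4)
    return math.exp(-k_star)
```

Since $k^* > K - 2$, the resulting bound $e^{-k^*}$ is at most $e^{2} e^{-K}$, which is where the leading factor $e^2$ in the final estimate comes from.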
7 Examples of Central Moment Bounded Random Variables

7.1 Bounded Random variables 

Lemma 7.1 Any random variable $Z$ with $|Z - E[Z]| \le L$ is central moment bounded with parameter $L$.

Proof. For any $i \ge 1$ we clearly have $|Z - E[Z]|^{i} \le L |Z - E[Z]|^{i-1}$, hence $E\left[|Z - E[Z]|^{i}\right] \le L \, E\left[|Z - E[Z]|^{i-1}\right] \le i L \, E\left[|Z - E[Z]|^{i-1}\right]$. ■
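The proof suggests that "central moment bounded with parameter $L$" means $E[|Z - E[Z]|^i] \le i L \, E[|Z - E[Z]|^{i-1}]$ for all integers $i \ge 1$. Taking that reading as an assumption, the sketch below checks the condition exactly for a finite distribution, up to a chosen order. The function names are hypothetical.

```python
from fractions import Fraction

def central_abs_moment(dist, i):
    """E[|Z - E[Z]|^i] for a finite distribution given as (value, probability) pairs."""
    mean = sum(p * z for z, p in dist)
    return sum(p * abs(z - mean) ** i for z, p in dist)

def is_central_moment_bounded(dist, L, max_order):
    """Check E|Z-EZ|^i <= i * L * E|Z-EZ|^(i-1) for i = 1, ..., max_order."""
    return all(
        central_abs_moment(dist, i) <= i * L * central_abs_moment(dist, i - 1)
        for i in range(1, max_order + 1)
    )
```

For a Bernoulli variable with success probability $1/4$ the deviation from the mean is at most $3/4$, so the check passes with $L = 3/4$, in line with Lemma 7.1.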

7.2 Continuous log-concave random variables 

We say that a non-negative function $f(x)$ is log-concave if $f(\lambda x + (1 - \lambda) y) \ge f(x)^{\lambda} f(y)^{1-\lambda}$ for any $0 \le \lambda \le 1$ and $x, y \in \mathbb{R}$ (see [5], Section 3.5). Equivalently, $f$ is log-concave if $\ln f(x)$ is concave on the set $\{x : f(x) > 0\}$ where $\ln f(x)$ is defined and this set is a convex set (i.e. an interval). A continuous random variable (or a continuous distribution) with density $f$ is log-concave if $f$ is a
log-concave function. See |2|, ^ for introductions to log-concavity. 

Schudy and Sviridenko [18] proved:

Lemma 7.2 ([18]) Any log-concave random variable $X$ with density $f$ is moment bounded with parameter $L = \frac{1}{\ln 2} E[|X|] \approx 1.44 \, E[|X|]$.

If $X$ is log-concave with density $f$ then $X - E[X]$ clearly has density $x \mapsto f(x + E[X])$, which is evidently log-concave. Therefore:

Corollary 7.3 Any log-concave random variable $X$ with density $f$ is central moment bounded with parameter $L = \frac{1}{\ln 2} E[|X - E[X]|] \approx 1.44 \, E[|X - E[X]|] \le 2.88 \, E[|X|]$.

7.3 Discrete log-concave random variables 

A distribution over the integers $\ldots, p_{-2}, p_{-1}, p_0, p_1, p_2, \ldots$ is said to be log-concave ||l], |lT| if $p_{i+1}^2 \ge p_i p_{i+2}$ for all $i$. An integer-valued random variable $X$ is log-concave if its distribution $p_x = \Pr[X = x]$ is.

The discrete case is a bit trickier than the continuous case since $X - E[X]$ might take non-integer values even if $X$ takes integer ones. We therefore can only get inspiration from the proof in [18] that discrete log-concave random variables are moment bounded rather than using it as we did in the continuous case.

Lemma 7.4 Let $X$ be a log-concave integer-valued random variable with $\Pr[X \ge \ell] = 1$ and $\Pr[X = \ell] > 0$ for some $\ell \in \mathbb{Z}$. Let $a \in \mathbb{R}$ be an arbitrary real such that $a \le \ell \le a + 1$. Then

$$E\left[|X - a|^{k}\right] \le k L \, E\left[|X - a|^{k-1}\right],$$

where $L = 1 + E[|X - a|]$.



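Proof. Let $u$ be the largest index $i$ such that $p_i > 0$, or infinity if there is no such index. By log-concavity we have $p_i > 0$ for all $i \in \mathbb{Z}$ with $\ell \le i \le u$. Let $r_i = \Pr[X \ge i]$. Note that $r_\ell = 1$ and when $u$ is finite $r_{i+1} = 0$ for all $i \ge u$. We bound

$$\begin{aligned}
E\left[|X - a|^{k}\right] &= \sum_{x=\ell}^{\infty} (r_x - r_{x+1}) (x - a)^{k} \\
&= (\ell - a)^{k} + \sum_{x=\ell+1}^{\infty} r_x \left[(x - a)^{k} - (x - 1 - a)^{k}\right] \\
&\le (\ell - a)^{k} + \sum_{x=\ell+1}^{\infty} r_x \, k (x - a)^{k-1} \\
&\le \sum_{x=\ell}^{\infty} r_x \, k (x - a)^{k-1} \\
&= \sum_{x=\ell}^{u} p_x \frac{r_x}{p_x} \, k (x - a)^{k-1} \\
&\le \Big(\sum_{x=\ell}^{u} p_x \frac{r_x}{p_x}\Big) \Big(\sum_{x=\ell}^{u} p_x \, k (x - a)^{k-1}\Big) \\
&= E\left[1 + |X - \ell|\right] \, E\left[k |X - a|^{k-1}\right] \\
&\le \left(1 + E[|X - a|]\right) k \, E\left[|X - a|^{k-1}\right],
\end{aligned}$$

where the second inequality uses the fact that $\ell - a \le 1 \le k$ and the third inequality follows from Chebyshev's summation inequality, which applies because $r_x / p_x$ is a non-increasing sequence (Proposition 10 in Q) and $k |x - a|^{k-1}$ is a non-decreasing sequence.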
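As a sanity check of Lemma 7.4 on a concrete log-concave distribution, the sketch below uses a binomial distribution (chosen here only as a familiar example, with $\ell = 0$) and exact rational arithmetic. The helper names are hypothetical, and `is_log_concave` only tests the inequality inside the support, which suffices for a distribution supported on a full integer interval.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Exact pmf of Binomial(n, p) as a dict x -> Pr[X = x]."""
    return {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

def is_log_concave(pmf):
    """Check p_{x+1}^2 >= p_x * p_{x+2} for consecutive triples inside the support."""
    xs = range(min(pmf), max(pmf) - 1)
    return all(pmf[x + 1] ** 2 >= pmf[x] * pmf[x + 2] for x in xs)

def abs_moment(pmf, a, k):
    """E[|X - a|^k]."""
    return sum(p * abs(x - a) ** k for x, p in pmf.items())

def lemma_7_4_holds(pmf, a, k):
    """Check E|X-a|^k <= k * L * E|X-a|^(k-1) with L = 1 + E|X-a|."""
    L = 1 + abs_moment(pmf, a, 1)
    return abs_moment(pmf, a, k) <= k * L * abs_moment(pmf, a, k - 1)
```

The shift `a` must satisfy the hypothesis relating it to the smallest support point $\ell$; values outside that window are not covered by the lemma.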



Lemma 7.5 Any log-concave integer-valued random variable $X$ such that $\Pr[X = E[X]] < 1$ is central moment bounded with parameter

$$L = 1 + \max\left(E\left[|X - E[X]| \;\middle|\; X \ge E[X]\right], \; E\left[|X - E[X]| \;\middle|\; X < E[X]\right]\right).$$

Proof. It follows that $\Pr[X \ge E[X]]$ and $\Pr[X < E[X]]$ are both strictly positive, hence by log-concavity $\Pr[X = \lceil E[X] \rceil]$ and $\Pr[X = \lceil E[X] \rceil - 1]$ are both strictly positive.



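Write

$$\begin{aligned}
E\left[|X - E[X]|^{k}\right] &= \Pr[X \ge E[X]] \, E\left[(X - E[X])^{k} \mid X \ge E[X]\right] + \Pr[X < E[X]] \, E\left[(E[X] - X)^{k} \mid X < E[X]\right] \\
&= \Pr[X \ge E[X]] \, E\left[(X_+ - E[X])^{k}\right] + \Pr[X < E[X]] \, E\left[(X_- + E[X])^{k}\right],
\end{aligned}$$

where $X_+$ is a random variable with $\Pr[X_+ = x] = \Pr[X = x \mid X \ge E[X]]$ and $X_-$ is a random variable with $\Pr[X_- = x] = \Pr[X = -x \mid X < E[X]]$ for integer $x$. Clearly $X_+$ and $X_-$ inherit log-concavity from $X$. We also have $X_+ \ge E[X]$ and $X_- > -E[X]$.

We apply Lemma 7.4 twice, first to $X_+$ with $a = E[X]$ and $\ell = \lceil E[X] \rceil$ and second to $X_-$ with $a = -E[X]$ and $\ell = 1 - \lceil E[X] \rceil$, yielding

$$\begin{aligned}
E\left[|X - E[X]|^{k}\right] &= \Pr[X \ge E[X]] \, E\left[(X_+ - E[X])^{k}\right] + \Pr[X < E[X]] \, E\left[(X_- + E[X])^{k}\right] \\
&\le \Pr[X \ge E[X]] \, k \left(1 + E[|X_+ - E[X]|]\right) E\left[|X_+ - E[X]|^{k-1}\right] \\
&\quad + \Pr[X < E[X]] \, k \left(1 + E[|X_- + E[X]|]\right) E\left[|X_- + E[X]|^{k-1}\right] \\
&\le k \left(1 + \max\left(E[|X_+ - E[X]|], E[|X_- + E[X]|]\right)\right) \\
&\quad \cdot \left(\Pr[X \ge E[X]] \, E\left[|X - E[X]|^{k-1} \mid X \ge E[X]\right] + \Pr[X < E[X]] \, E\left[|X - E[X]|^{k-1} \mid X < E[X]\right]\right) \\
&= k \left(1 + \max\left(E\left[|X - E[X]| \mid X \ge E[X]\right], E\left[|X - E[X]| \mid X < E[X]\right]\right)\right) E\left[|X - E[X]|^{k-1}\right].
\end{aligned}$$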